1.  INTRODUCTION

All  of  us  have  looked  through  a  stack  of  periodicals  at  one  time 
or  another.  Maybe  we  were  writing  a  term  paper  or  perhaps  just  looking 
through  the  old  issues  of  Time  magazine  to  find  a  special  article  that  we 
just  knew  was  there,  but  didn't  really  know  which  issue. 

If we were lucky, what we were looking for turned out to be a cover
story and not too much time was involved. More often than not, though, we
paged through the whole pile and all we ended up with to show for our efforts
was a handful of paper cuts. The trouble was that, even when we were done,
we couldn't positively say that what we were looking for wasn't there, because
we hadn't actually read all the articles.

But, you say, it's our own fault; what we should have done first is
go to the Readers' Guide to Periodical Literature. That is fine advice if the
topic is George Wallace being shot in Maryland, which will undoubtedly be
indexed in the Guide. But what if we have a more abstract topic, such as the
effect of the Vietnam War on the Russian economy? There is no such category
in the Guide, so either you go through all the articles on Russia or, worse
yet, you look at all the articles on Vietnam trying to find mention of the
Russian economy. Either way, your chances of finding an article on the topic
are very slim.

Imagine, however, being assigned that topic for a report but with
one small difference. Instead of having the Readers' Guide to Periodical
Literature, at your disposal is an information retrieval system. The data
base for the system consists of all the issues of Time magazine for the
past two years. Using the system you are able to search the articles for
words and phrases which appear in the title of the article or in the actual
text itself. For instance, asking for a search on the word "Vietnam" would
produce a list of where all the articles on Vietnam had appeared. It would
also include articles which just mention the word. That means an article
on a sports hero who spoke out against the war would also be in the list, as
long as it actually contained the word "Vietnam".

Due  to  this  fact,  we  are  careful  to  never  ask  for  a  search  where 
the  number  of  successful  matches  between  text  and  search  pattern,  called 
"hits",  is  large.   For  instance,  if  we  really  wanted  articles  on  just  Vietnam, 
we  would  search  titles  of  articles  and  not  the  actual  text.   This  would 
reduce  the  number  of  unwanted  hits. 

How would this help you find the effect of Vietnam on the Russian
economy, though? The supervisor of the information retrieval system is smart.
It realizes that a single word or phrase in an article is not enough to find
articles of interest for most people. Therefore, it allows multiple searches
in single sentences, paragraphs and entire articles. For example, in our
case we want all occurrences of the patterns "Vietnam" and "Russian economics"
in the same article. To be on the safe side, we might put in variants of
Russia, such as U.S.S.R., because only exact matches will score hits.
Naturally, some of our hits will be extraneous to our topic. The sentence
"On his trip to the Soviet Union the President will be accompanied by Vietnam
expert, Henry Kissinger, and economic adviser, John Connally, who has recently
studied the Russian economic plight," will score a hit.
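The search behaviour described so far can be illustrated with a minimal Python sketch. It is not the supervisor's actual code: the function name find_hits, the titles_only flag and the three sample articles are all invented for the illustration. It assumes only what the text states, namely that a hit requires an exact match and that several patterns may be required to occur in the same article.

```python
def find_hits(articles, patterns, titles_only=False):
    """Return the names of articles in which every pattern occurs exactly."""
    hits = []
    for name, (title, text) in articles.items():
        # Search either the title alone or the title plus the full text.
        searched = title if titles_only else title + "\n" + text
        if all(pattern in searched for pattern in patterns):
            hits.append(name)
    return hits


# Hypothetical sample data, invented for this sketch.
ARTICLES = {
    "sports": ("A Champion Speaks Out",
               "The fighter spoke out against the war in Vietnam."),
    "trip": ("The President Goes to Moscow",
             "He will be accompanied by a Vietnam expert and by an adviser "
             "who has studied the Russian economic plight."),
    "war": ("Vietnam: The Long Road", "Fighting continued this week."),
}

print(find_hits(ARTICLES, ["Vietnam"]))                    # text search
print(find_hits(ARTICLES, ["Vietnam"], titles_only=True))  # titles only
print(find_hits(ARTICLES, ["Vietnam", "Russian"]))         # two patterns
```

Searching titles only removes the article that merely mentions the word, and requiring two patterns narrows the list further, which is the effect the text describes.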

After  all  the  hits  have  been  found,  you  could  ask  the  system  to 
print  out  a  hard  copy  of  the  articles  on  a  line  printer. 


Even  if  no  hits  are  found,  the  system  has  been  helpful  for  you 
because  you  can  be  confident  that  there  are  no  articles  in  the  magazine 
touching  on  your  topic.   If  you  were  successful,  then  you  obtained  your 
information  with  far  less  effort  and  probably  with  greater  accuracy  (i.e. 
you  didn't  miss  any  additional  articles)  and  with  no  paper  cuts. 

Although the data base currently is not nearly as large as in the
example above, this is typical of the file searching problems which are
being researched under Professor David Kuck at the University of Illinois.
Currently, the actual data base is a series of 65 scientific papers stored
on a 25 million bit, two surface disk. The computer used for the file
searching is a Burroughs D-Machine minicomputer. The machine is
microprogrammable, with 1K of 64 bit micro-memory and 4K of 16 bit main memory
words (S-Memory). Instructions are stored in main memory, then fetched one at
a time and interpreted by a series of micro-instructions. The interpreter was
written by Hirohide Yamada [1].

The  actual  file  searching  supervisor  was  written  by  William 
Stellhorn.   The  program,  which  relies  heavily  on  overlay  structures  due  to 
the  limited  amount  of  main  memory,  was  written  in  an  assembly  language 
called  the  S-Language.   The  content  of  this  report  is  a  user's  guide  to  the 
S-Language  and  a  detailed  explanation  of  the  assembler. 

It is hoped that this paper will enable users to program easily
in the S-Language, and will enable other programmers to make additions to the
S-Language and its assembler.


2.  THE S-LANGUAGE

2.1  Data  Format 

The memory of the D-Machine is 4096 words long, each word being
16 bits in length. All instructions are a multiple of 16 bits in length.
All numeric data is handled in fullword integer format. Negative numbers
are represented in two's (radix) complement form. Therefore the largest
positive number which may be represented on the machine is 32767 and the
most negative number is -32768.
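As a worked illustration of the fullword format, the following Python sketch converts between signed integers and 16 bit two's complement words. The helper names to_word and from_word are invented for the sketch and are not part of the S-Language.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1   # 0xFFFF


def to_word(value):
    """Encode a signed integer as a 16 bit two's complement word."""
    if not -32768 <= value <= 32767:
        raise ValueError("value does not fit in a fullword")
    return value & WORD_MASK


def from_word(word):
    """Decode a 16 bit word; bit 15 is the sign bit."""
    return word - 0x10000 if word & 0x8000 else word


print(hex(to_word(-1)))      # all sixteen bits set
print(from_word(0x7FFF))     # largest positive number
print(from_word(0x8000))     # most negative number
```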

Character data is stored in an 8 bit ASCII form; thus two characters
may be stored in a single word, and each half word (8 bits) is called a
byte. For the representation of the ASCII character set see Figure 1. All
bit numbering is done right to left in a word, starting with bit 0 up to
bit 15. The high order byte of a word contains the sign bit as its leftmost
bit and therefore consists of bits 8-15. The low order byte contains bits
0-7.
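The byte layout can be sketched in Python as follows. The function names are invented for the illustration, and placing the first character of a pair in the high order byte is an assumption; the text only says that two characters share a word.

```python
def pack(high_char, low_char):
    """Pack two ASCII characters into one word (first character assumed high)."""
    return (ord(high_char) << 8) | ord(low_char)


def high_byte(word):
    """Bits 8-15 of the word."""
    return (word >> 8) & 0xFF


def low_byte(word):
    """Bits 0-7 of the word."""
    return word & 0xFF


def sign_bit(word):
    """Bit 15, the leftmost bit of the high order byte."""
    return (word >> 15) & 1


word = pack("A", "B")
print(hex(word), chr(high_byte(word)), chr(low_byte(word)), sign_bit(word))
```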

2.2  Coding  Format  and  Continuation  Cards 

Each instruction must start on a new card and all information must
be coded in columns 1-71 inclusive. Any number of continuation cards may be
used for a single statement. To indicate that a statement on a card is
continued, any mark is placed in column 72. The following card is then
interpreted as being a continuation card.

Coding is free format for all fields except mnemonics and labels.
The mnemonic of an instruction must begin in column 10 of a card and must be
followed by at least one blank before any operands. When a card is continued
the first character on the continuation card should be in column 19. It is
an error to begin a continuation card before column 19. Any column after
column 19 is, of course, also accepted. Care should be exercised in coding
continuation cards for character literals since starting the literal after
column 19 on the continuation card will introduce unwanted blanks into the
literal.

Figure 1.  ASCII Character Set

Labels  must  start  in  column  one  and  may  be  up  to  eight  alphanumeric 
characters  long,  the  first  of  which  must  be  alphabetic.   Combined  with  the 
restriction  on  the  starting  point  for  mnemonics,  this  means  column  nine  must 
be  blank. 

Variable  names  must  also  begin  with  an  alphabetic  character  and  may 
be  up  to  eight  alphanumeric  characters  long. 

No  imbedded  blanks  are  allowed  within  labels,  mnemonics  or  variable 
names. 

An  instruction  may  be  ended  by  leaving  the  card  blank  after  the 
last  operand  or  by  punching  a  semicolon  (;)  after  the  last  operand.  If  the 
semicolon  is  used  it  must  be  preceded  by  at  least  one  blank  and  then  the  rest 
of  the  card  may  be  used  for  a  comment,  as  it  is  ignored  by  the  assembler. 
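The column rules of this section can be summarized in a small Python sketch that splits one card image into its fields. It is only an illustration under stated assumptions: parse_card and its result keys are invented names, labels are assumed to be upper case, and a blank followed by a semicolon inside a character literal is not handled.

```python
import re


def parse_card(card):
    """Split one 80 column card image into label, mnemonic, operands, comment."""
    card = card.ljust(80)
    continued = card[71] != " "            # any mark in column 72
    label = card[0:8].rstrip()             # columns 1-8
    if label and not re.fullmatch(r"[A-Z][A-Z0-9]{0,7}", label):
        raise ValueError("bad label: " + label)
    if card[8] != " ":
        raise ValueError("column 9 must be blank")
    body = card[9:71]                      # columns 10-71
    # A semicolon preceded by a blank ends the instruction; the rest is comment.
    code, _, comment = body.partition(" ;")
    if code.strip() and body[0] == " ":
        raise ValueError("mnemonic must begin in column 10")
    mnemonic, _, operands = code.strip().partition(" ")
    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands.strip(),
        "comment": comment.strip(),
        "continued": continued,
    }


card = "LOOP".ljust(9) + "ADD   A, =1, B ;bump B"
print(parse_card(card))
```

A continuation card would then be read from column 19 onward and appended to the operands of the card before it.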

2.3  Options 

The  assembler  is  designed  to  produce  object  code  for  either  the 
D-Machine  or  the  Burroughs  5500  simulator.   In  order  to  specify  the  type  of 
output  desired  from  the  assembler  an  OPTION  statement  is  used.  The  mnemonic 
'OPTION'  is  used  as  in  any  statement  and  is  followed  by  any  of  the  following 
operands  with  the  effect  listed  beside  it. 

NOPUNCH      no object deck is punched, only a listing is produced
             (default if no option card found)

D-MACHINE    object code for D-Machine is produced

SIMULATOR    object code for the B-5500 is produced

An  option  statement  may  occur  anywhere  in  the  program.   In  case  more  than  one 
is  found  the  last  one  will  be  the  one  used.   If  no  option  card  is  found  the 
default  is  NOPUNCH. 

The  statement  is  not  executable  and  should  not  contain  a  label.  A 
label  on  such  a  statement  will  be  unknown  to  the  assembler  and  will  be  flagged 
as  undefined  if  referenced. 

Although core addresses are punched on each object card produced by
the assembler, the simulator does not use the address. Therefore STORAGE
instructions produce an appropriate number of zero words when option SIMULATOR
is used.

The form of the object decks is as follows:

SIMULATOR

Each word of core is preceded by a blank and a percent sign (%) and
is six octal digits long. There may be up to eight words on a card, with the %
falling in columns 2, 10, 18, 26, 34, 42, 50 and 58. Columns 65-69 are blank,
column 70 has a slash (/) to denote the end of the code, columns 71 and 72 are
blank, columns 73 and 74 contain the characters 'SM' and columns 75-80 contain
the memory address of the first word on the card, right justified in base 10.

D-MACHINE

Columns 1-3 have the memory address right justified in hexadecimal.
Column 4 is blank. Column 5 contains the character 'S'. Column 6 is blank.
Column 7 has the number of words of data minus one, in hexadecimal. Column 8 is
blank. Columns 9-72 contain the data in the EBCDIC equivalent of their
hexadecimal form.
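The two card layouts can be sketched in Python as below. The function names are invented, and several details are assumptions not settled by the text: the slash is placed on every simulator card, short cards are padded with blanks, the decimal address is blank padded, and the hexadecimal address is zero padded. The hexadecimal digits are produced as ordinary characters; their EBCDIC punching is not modelled.

```python
def simulator_card(address, words):
    """Build one 80 column SIMULATOR object card (up to eight words)."""
    if not 1 <= len(words) <= 8:
        raise ValueError("a simulator card holds one to eight words")
    data = "".join(" %{:06o}".format(w & 0xFFFF) for w in words)
    card = data.ljust(64)                    # columns 1-64
    card += " " * 5 + "/" + " " * 2          # 65-69 blank, 70 slash, 71-72 blank
    card += "SM" + str(address).rjust(6)     # 73-74 'SM', 75-80 address
    return card


def d_machine_card(address, words):
    """Build one D-MACHINE object card (up to sixteen words)."""
    if not 1 <= len(words) <= 16:
        raise ValueError("a D-Machine card holds one to sixteen words")
    data = "".join("{:04X}".format(w & 0xFFFF) for w in words)
    # Columns 1-3 address, 5 'S', 7 word count minus one, 9-72 data.
    return "{:03X} S {:X} ".format(address, len(words) - 1) + data


print(simulator_card(89, [1, 0o177777]))
print(d_machine_card(89, [0x4142, 0x0001]))
```

Sixteen words of four hexadecimal digits each fill columns 9-72 exactly, which agrees with the single hexadecimal digit allowed for the count in column 7.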


2.4  Reserved Storage

The first 89 memory locations in the S-Memory (0-88) are reserved
for special purposes. Thus code generation always begins at word 89. The
locations from 16 to 79 have been named as follows:

Location    Name

16-31       IAR0-IAR15, Interrupt Address Registers

32-47       PTR0-PTR15, Pointer Registers

48-63       CHR0-CHR15, Character Registers

64-79       CTR0-CTR15, Counter Registers

Throughout  this  paper  these  locations  will  be  referenced  by  their 
names  instead  of  their  locations. 
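Since each of the four groups holds sixteen consecutive registers in locations 16 to 79, a register name maps to its core location by simple arithmetic. The Python sketch below shows the mapping; the function name register_address is invented for the illustration.

```python
REGISTER_BASES = {"IAR": 16, "PTR": 32, "CHR": 48, "CTR": 64}


def register_address(name):
    """Return the S-Memory location of a named register such as 'PTR8'."""
    prefix, number = name[:3], int(name[3:])
    if prefix not in REGISTER_BASES or not 0 <= number <= 15:
        raise ValueError("unknown register: " + name)
    return REGISTER_BASES[prefix] + number


for name in ("IAR0", "PTR8", "CHR15", "CTR15"):
    print(name, register_address(name))
```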

2.5  Operand  Format 

The  following  conventions  are  used  for  operands  and  addresses  occurring 
in  word  and  string  instructions.  Assume  A  is  a  label  on  core  location  100. 

Operand                     Meaning

A                           Contents of core location 100

=A                          Value of A, i.e. 100

*A                          Contents of core location whose address is
                            in core location 100 (single indirect
                            addressing)

**A                         Double indirect addressing

<A>                         High order byte address of A, i.e. 200

<<A>>                       Low order byte address of A, i.e. 201

150                         Contents of core location 150

*150                        Single indirect address

**150                       Double indirect address

=25                         The number 25 (base 10)

<150>                       High order byte address, i.e. 300

<<150>>                     Low order byte address, i.e. 301

=:1A:                       Hexadecimal number, 26 (if the number
                            specified does not fill the field, zeros are
                            supplied on the left)

'DON''T FORGET 2 QUOTES'    The character string DON'T FORGET 2 QUOTES;
                            single quotation marks are put into character
                            strings by using two single quotation marks

$                           The value of the location counter

Anywhere the label A appears in the operand formats above it may
be replaced by A plus constant, A minus constant, $, $ plus constant or $ minus
constant, where 'constant' is a base 10 number.
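The operand conventions can be illustrated with a short Python sketch that evaluates an operand against a model of core. It is a sketch only: the names byte_address and operand_value are invented, the byte address forms are written here as <A> and <<A>>, double indirection is assumed to mean one further level of lookup, and the forms using $ or a label plus or minus a constant are not handled.

```python
def byte_address(word_address, low=False):
    """Byte address of the high (default) or low order byte of a word."""
    return 2 * word_address + (1 if low else 0)


def operand_value(operand, memory, symbols):
    """Evaluate one operand; symbols maps labels to core locations."""
    def address_of(term):
        return symbols[term] if term in symbols else int(term)

    if operand.startswith("=:") and operand.endswith(":"):
        return int(operand[2:-1], 16)                  # hexadecimal literal
    if operand.startswith("="):
        return address_of(operand[1:])                 # value of label or number
    if operand.startswith("<<") and operand.endswith(">>"):
        return byte_address(address_of(operand[2:-2]), low=True)
    if operand.startswith("<") and operand.endswith(">"):
        return byte_address(address_of(operand[1:-1]))
    if operand.startswith("**"):                       # assumed: two lookups, then contents
        return memory[memory[memory[address_of(operand[2:])]]]
    if operand.startswith("*"):
        return memory[memory[address_of(operand[1:])]]
    return memory[address_of(operand)]                 # contents of the location


memory = [0] * 4096
symbols = {"A": 100}
memory[100], memory[150], memory[7] = 150, 7, 42
for text in ("A", "=A", "*A", "**A", "<A>", "<<A>>", "=:1A:", "150"):
    print(text, operand_value(text, memory, symbols))
```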


2.6  Word  Instructions 


The basic form of the Word Instructions is:

[label]    OP code    Operand 1, Operand 2, Operand 3

They primarily use the three address format, where an operator is
applied to operands one and two and the result is placed in operand three. For
the bit patterns of the Word Instructions see Figures 2-4.

In the list below, when two or more instructions are grouped together,
the last mnemonic usually ends with a V, indicating an overflow instruction.
For these instructions, in addition to the operation described, if overflow
occurs the address of the current instruction minus one is stored in IAR0 and
program control is passed to the content of IAR1 plus one. All other
instructions ignore overflow.
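The overflow rule can be sketched in Python for an add. The function names are invented. Two points are assumptions rather than statements of the text: that the wrapped 16 bit result is still produced when overflow occurs, and that without overflow control continues at the current address plus the instruction length of four words.

```python
def add_words(a, b):
    """Add two signed fullwords; return (wrapped result, overflow flag)."""
    total = a + b
    overflow = not -32768 <= total <= 32767
    wrapped = (total + 32768) % 65536 - 32768
    return wrapped, overflow


def addv(a, b, current, iar):
    """Model an overflow add; return (result, address of next instruction)."""
    result, overflow = add_words(a, b)
    if overflow:
        iar[0] = current - 1          # address of current instruction minus one
        return result, iar[1] + 1     # content of IAR1 plus one
    return result, current + 4        # assumed: next sequential instruction


iar = [0] * 16
iar[1] = 500
print(addv(1, 2, 200, iar))
print(addv(32767, 1, 200, iar), iar[0])
```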


  16 bits      16 bits      16 bits      16 bits
| OP field  | Operand 1  | Operand 2  | Operand 3  |

Figure 2.  Word Instruction Format


Figure 3.  OP Field Format (a 16 bit field, bits numbered 15 to 0, containing
the OP code and the IL2 and IL1 subfields, among others)


For  those  instructions  which  do  not  indicate  otherwise,  if  the 
third  operand  is  omitted  then  it  is  assumed  that  the  first  operand  is  the 
target  of  the  operation  and  it  has  the  same  effect  as  having  duplicated  the 
first  operand  in  the  third  position. 

The  third  operand,  unlike  the  first  two,  is  not  doubly  indirect 
addressable  and  may  not  be  a  literal.  Therefore,  if  the  third  operand  is 
omitted,  the  first  operand  may  not  be  one  of  these  types  either. 

Operands must be separated by commas but may have any number of
blanks between them. Notice that this is not permitted in String Instructions
(see section 2.7).

The following is a list of the Word Instructions, the number of
words of core each instruction uses and a description of the action of the
instruction.

Mnemonic  Function

ADD  Add
ADDV

length 4 words

The sum of operand 1 and operand 2 is stored in the location
specified by operand 3.

AND  Logical AND

length 4 words

The logical product (conjunction) of operand 1 and operand 2 is
stored in the location specified by operand 3.

CNVTB  Convert to binary

length 4 words

The character string at the address specified by operand 1, of the
length specified by operand 2, is converted to binary from ASCII and stored in
the location specified by operand 3. Byte addressing is used for operand 1;
if operand 1 specifies a word address it is converted to the byte address of
its high order byte. Maximum length is six bytes.

CNVTD  Convert to decimal

length 4 words

The fullword specified by operand 1 is converted to its ASCII
equivalent and the rightmost number of bytes as specified by operand 2 are
stored starting at the address specified by operand 3. Operand 3 should be a
byte address and if not will be converted to the byte address of the high order
byte of that word. Maximum length is six bytes.
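The two conversions can be approximated in Python as follows. This is a sketch under assumptions the text does not settle: the ASCII form of a fullword is taken to be six characters, right justified with leading blanks, and the string given to the binary conversion is taken to hold decimal digits with optional blanks or sign. The function names simply echo the mnemonics.

```python
def cnvtb(data, byte_address, length):
    """Convert `length` ASCII bytes starting at `byte_address` to an integer."""
    if not 1 <= length <= 6:
        raise ValueError("maximum length is six bytes")
    text = bytes(data[byte_address:byte_address + length]).decode("ascii")
    return int(text)


def cnvtd(value, length):
    """Return the rightmost `length` characters of the six character form."""
    if not 1 <= length <= 6:
        raise ValueError("maximum length is six bytes")
    full = str(value).rjust(6)        # assumed: blank padded, six characters
    return full[-length:]


print(cnvtb(b"XX0123", 2, 4))
print(repr(cnvtd(1234, 6)), repr(cnvtd(1234, 2)))
```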

DEC  Decrement
DECR
DECRV

length 3 words

Only one or two operands are specified. The first operand minus
one is stored in the location specified by operand 2, or if operand 2 is
omitted, back into its own location. Notice that DEC and DECR are the
same instruction.

EQUIV  Equivalence

length 4 words

The not of the exclusive OR of operand 1 and operand 2 is stored
in the address specified by operand 3. This is not an overflow instruction.

EXOR  Exclusive OR

length 4 words

The exclusive OR of operand 1 and operand 2 is stored in the
location specified by operand 3.

INC  Increment
INCR
INCV

length 3 words

Only one or two operands are specified. The first operand plus one
is stored in the location specified by operand 2, or if operand 2 is omitted,
back into its original location. Notice that INC and INCR are the same
instruction.

JUMP  Branch  unconditionally 

length  2  words 

Transfer  to  the  location  specified  by  the  single  operand. 

JUMPEQ  Branch on equal

length 4 words

Compare operand 1 and operand 2; if they are equal branch to the location
specified by operand 3. All three operands must be present.

JUMPGE  Branch greater than or equal
JUMPGEV

length 4 words

Compare operand 1 and operand 2; if operand 1 is greater than or
equal to operand 2 then transfer to the location specified by operand 3. The
compare uses a subtract so overflow may occur. JUMPGEV can be used to also
check for overflow. All three operands must be present.

JUMPLT  Branch less than
JUMPLTV

length 4 words

Compare operand 1 and operand 2; if operand 1 is less than operand
2 transfer control to the location specified by operand 3. All three operands
must be present.

JUMPNEG  Branch negative

length 3 words

If operand 1 is negative transfer to the location specified by operand 2.
Both operands must be present.

JUMPNEQ  Branch not equal

length 4 words

Compare operand 1 and operand 2; if they are not equal transfer
control to the location specified by operand 3. All three operands must be
present.

JUMPNZ  Branch  not  zero 

length  3  words 

If  operand  1  is  not  zero  transfer  control  to  the  location  specified 
by  operand  2.  Both  operands  must  be  present. 

JUMPPZ  Branch  positive  or  zero 

length  3  words 

If  operand  1  is  positive  or  zero  transfer  to  the  location  specified 
by  operand  2.  Both  operands  must  be  present. 

JUMPST  Branch and store location counter

length 3 words

The address of the next instruction minus one is stored in the
location specified by operand 1. Control is passed to the location specified
by operand 2.

JUMPZ  Branch zero

length 3 words

If operand 1 is zero, control is passed to the location specified
by operand 2. Two operands are required.

MOVE  Move

length 3 words

One word (16 bits) of data as specified by operand 1 is transferred
to the location specified by operand 2.

NAND  Logical NAND

length 4 words

Operand 1 and operand 2 are NANDed together and the result stored
in the location specified by operand 3.

NOR  Logical NOR

length 4 words

Operand 1 and operand 2 are NORed together and the result stored in
the location specified by operand 3.

NOT  Logical  NOT 

length  3  words 

Operand  1  has  all  its  bits  inverted  and  is  then  stored  in  the 
location  specified  by  operand  2,    or  if  operand  2  is  not  specified,  into  its 
original  location. 

OR  Logical OR

length 4 words

The logical sum (disjunction) of operand 1 and operand 2 is formed
and stored in the location specified by operand 3.
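The complemented logical operations are easy to get wrong on a fixed width word, so a short Python sketch may help. Each result is masked to 16 bits, since the complement must stay inside the word. The function names are illustrative and are not assembler mnemonics.

```python
MASK = 0xFFFF   # sixteen bit word


def word_not(a):
    """All bits inverted."""
    return ~a & MASK


def nand(a, b):
    """Complement of the logical product."""
    return ~(a & b) & MASK


def nor(a, b):
    """Complement of the logical sum."""
    return ~(a | b) & MASK


def equiv(a, b):
    """Complement of the exclusive OR."""
    return ~(a ^ b) & MASK


print(hex(nand(0xFFFF, 0x0F0F)))
print(hex(nor(0xF000, 0x000F)))
print(hex(equiv(0xFF00, 0xF0F0)))
```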

ROTR  Shift right circular

length 4 words

The fullword specified by operand 1 is shifted right the number of
bits specified by operand 2 and the result is stored in the location specified
by operand 3. Bits falling off the right side of the word are placed back in
on the left side. Be sure that operand 2 is a numeric literal if the shift
is to be a constant amount. The instruction:

ROTR  100, 3

will not shift the word at location 100 three bits to the right. It will shift
it an amount specified by the word at storage location 3. The correct
instruction is:

ROTR  100, =3
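The circular shift, and the pitfall just described, can be sketched in Python. The function name rotr echoes the mnemonic; reducing the count modulo 16 is an assumption, as the text does not say how larger counts are treated.

```python
def rotr(word, count):
    """Rotate a 16 bit word right; bits leaving the right re-enter on the left."""
    count %= 16   # assumed treatment of counts of 16 or more
    return ((word >> count) | (word << (16 - count))) & 0xFFFF


memory = [0] * 4096
memory[100] = 0x0001
memory[3] = 4

# ROTR 100, 3  uses the contents of location 3 as the shift amount.
print(hex(rotr(memory[100], memory[3])))
# ROTR 100, =3 uses the literal 3.
print(hex(rotr(memory[100], 3)))
```

With location 3 holding 4, the first form rotates by four bits rather than three, which is exactly the mistake the text warns about.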

SHIFTL  Shift left
SHIFTR  Shift right

length 4 words

Shift the fullword specified by operand 1 an amount specified by
operand 2 and store it in the location specified by operand 3. As bits are
shifted right (left) out of the word, zeros are introduced on the opposite
side of the word. Be careful about specifying operand 2. See the note in
the ROTR instruction.

SUB  Subtract
SUBV

length 4 words

Operand 2 is subtracted from operand 1 and the result stored in
the location specified by operand 3.

2.7  String  Instructions 

The String Instructions are designed to do character manipulation
work. They enable the programmer to search for strings of characters in
other strings, to move blocks of characters and to replace the individual
characters of a string from a translate table. The basic form of each string
instruction is as follows:

[label]    OP Code    Operand1{,Operand}    Jump Control    {Jump Control}    [Mask]

where  [ ] is optional
       { } is optional but may occur more than once.

The coding for string instructions is different from that for all other
types of instructions because there are a variable number of fields for many
of them. Consequently, a blank character is always assumed to be the end of a
field. This means that there may be no blanks between operands in a field. For
example, one of the string instructions, search forward, might look like:

                   Field 1             Field 2    Field 3    Field 4
LABEL    SEARCHF   CHR14,CHR11,PTR8    0,*0,*0    0,+1       0,0

A blank after CHR14 in field 1 would be illegal, as would blanks
around the plus sign (+) in field 3. Notice that only operand fields are
numbered; thus CHR14 is operand 1 in field 1. This is necessary to understand
some of the error messages produced for string instructions.

Continuation cards are handled identically with those for all other
instructions. See section 2.2. Special care must be exercised when an
instruction is broken in the middle of a field. If the continuation card is not
begun in column 19, it is probable that a blank will be introduced in the field,
which would cause an error condition and prevent assembly of the instruction.

2.7.1  Jump Controls

String instructions have special fields to indicate actions to be
taken when the operation, specified by the mnemonic of the instruction, is
completed. These fields are called Jump Controls. There may be up to three
jump controls for some string instructions. Each control represents a set of
actions to be taken depending upon which of a set of conditions occurred during
the instruction.

The format of a jump control is a single field containing two or
three operands. The first operand specifies from where the next instruction
will be fetched. It may specify an interrupt address register or be zero (0).
If an interrupt address register is specified using the form IARn (0 <= n <= 15),
then the next instruction is fetched from one plus the address specified in
IARn. If zero is specified, the instruction following the current one
will be fetched. Subsequent operands specify how pointers are treated. Each
string instruction has one or more pointers associated with it. These usually
point to character strings. They may be updated in the following ways:

*0         Leave pointer at position when instruction began
0          Leave pointer at position after operation was completed
+1 or 1    Increment pointer by one after operation
-1         Decrement pointer by one after operation

Remember, operands in a field may contain no imbedded blanks.
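The two halves of a jump control can be modelled in Python as below. The function names are invented, and memory is modelled as a list of words with the interrupt address registers at locations 16-31 as given in section 2.4. Taking the increment and decrement relative to the position after the operation is an interpretation of the wording "after operation".

```python
def next_instruction_address(selector, memory, following):
    """First operand of a jump control: '0' or 'IARn'."""
    if selector == "0":
        return following                   # fall through to the next instruction
    if selector.startswith("IAR"):
        n = int(selector[3:])
        if 0 <= n <= 15:
            return memory[16 + n] + 1      # one plus the address held in IARn
    raise ValueError("bad jump control: " + selector)


def update_pointer(start, end, code):
    """Later operands: how a pointer is left after the operation."""
    if code == "*0":
        return start                       # position when the instruction began
    if code == "0":
        return end                         # position when the operation completed
    if code in ("+1", "1"):
        return end + 1
    if code == "-1":
        return end - 1
    raise ValueError("bad pointer update: " + code)


memory = [0] * 4096
memory[16 + 2] = 300                       # IAR2 holds 300
print(next_instruction_address("IAR2", memory, 120))
print([update_pointer(10, 17, c) for c in ("*0", "0", "+1", "-1")])
```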

2.7.2  Instruction List

The following is a list of all the string instructions, a model of
their structure and an explanation of their function. In the model for each
instruction alternatives are listed below one another for some operands. Unless
these operands are enclosed in brackets, [ ], one of the alternatives must be
selected for each operand. For example, compare the SEARCHF instruction in
section 2.7 with the model for the SEARCHF instruction in this section. In the
following section PD is an abbreviation for data pointer, PK for key pointer
and JCn for Jump Control n.
COMPARE  Compare two strings

length 4 words

                      PK   PD   Key     JC1           JC2
[label]    COMPARE    PTRn,PTRn,CHRn    IARn,*0,*0    IARn,*0
                      CHRn      CTRn    0    0  0     0    0
                                             +1 +1         +1
                                             -1 -1         -1
                                            (PD)(PK)      (PD)

The  string  whose  byte  address  is  contained  in  the  register  specified 
by  PK  is  compared  to  the  string  similarly  specified  by  PD  until  an  unmatched 
character  is  encountered  or  until  the  character  specified  by  Key  is  found.  Key 
is  taken  to  be  the  low  order  byte  (rightmost  8  bits)  of  the  register  it  specifies. 
If  Key  is  found  before  a  mismatch  JC1  is  used.   If  a  mismatch  is  found  JC2  is 
used.   For  JC1  operand  2  updates  PD,  operand  3  updates  PK.   For  JC2  operand  2 
updates  PD,  PK  may  not  be  updated  but  is  automatically  restored  to  its  original 
position.   The  difference  between  the  data  pointer  position  at  the  beginning  of 
the  instruction  and  just  before  the  update,  is  stored  in  CTR0. 

FIND  Find a character string in a second string

length 4 words

                   PK   PD   Key1    JC1           JC2        Key2
[label]    FIND    PTRn,PTRn,CHRn    IARn,*0,*0    IARn,*0    [CHRn]
                   CHRn      CTRn    0    0  0     0    0
                                          +1 +1         +1
                                          -1 -1         -1
                                         (PD)(PK)      (PD)

The string specified by PD is searched for the string specified by
PK until either the two strings match up to the character specified by Key2
or until Key1 is found in the PD string. For both pointers the register
contains the byte address of the string. For the keys the register must contain
the character in its low order byte. If the match is successful (Key2 is
found), then JC1 is used. If Key1 is found then JC2 is used. If Key2 is
omitted then the default is zero, which is the character :00:. Note that PK
may not be updated using JC2. The difference between the data pointer position
at the beginning of the instruction and just before the update is stored in
CTR0.


MOVEF  Move a character string
MOVER

length 3 words

                    PK   PD   Count    JC
[label]    MOVEF    PTRn,PTRn,CHRn     [IARn,*0,*0]
           MOVER    CHRn      CTRn       0    0  0
                                              +1 +1
                                              -1 -1
                                             (PD)(PK)

The number of characters specified by the count plus one are moved
from string PD to string PK. The count is determined by taking the register
specified modulo 256. Thus a maximum of 256 characters may be moved at one time.
The registers specifying PD and PK must contain the byte address of their
respective strings. Move forward (MOVEF) starts at PD and PK and works to the
right on both strings. Move reverse (MOVER) starts at PD and PK and works
to the left on both strings. There is only one possible action, that specified
by JC. If JC is not specified then 0,0,0 is assumed. The difference between
the data pointer position at the beginning of the instruction and just before
the update is stored in CTR0.
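The count rule and the two directions of the move can be sketched in Python. Memory is modelled here as a bytearray indexed by byte address, and the function name move and its reverse flag are invented for the sketch. Pointer updating and the CTR0 result are left out.

```python
def move(memory, pd, pk, count_register, reverse=False):
    """Copy (count_register mod 256) + 1 bytes from PD to PK; return the count."""
    count = count_register % 256 + 1
    step = -1 if reverse else 1        # the reverse move works to the left
    for i in range(count):
        memory[pk + step * i] = memory[pd + step * i]
    return count


forward = bytearray(b"HELLO.....")
print(move(forward, 0, 5, 4), forward)

backward = bytearray(b"ABC...")
print(move(backward, 2, 5, 2, reverse=True), backward)
```

A count register of 255 moves the maximum of 256 characters, while a register of 256 wraps to zero and moves a single character.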
SEARCHF  Search character string, forward and reverse
SEARCHR

length 4 words

For search forward and reverse instruction model, see Figure 5.

The characters specified by Key1, Key2, and Key3 are searched for in
the character string specified by PD, which must be a byte address. Key3 must
be in the low order byte of the specified register. Key1 and Key2 are also in
the low order bytes of their registers unless they are preceded by an asterisk
(*). This indicates indirect addressing. The key is then taken to be the
character pointed at by the byte address in the specified register. If Key2
is not specified then Key1 is duplicated and used for Key2 also. Jump Control
1 corresponds to finding Key3, Jump Control 2 corresponds to Key1, and Jump
Control 3 corresponds to Key2. If a mask is specified then the high order byte
of its register must be zero and data is ORed with the mask character before
comparing with Key1 and Key2. No masking facility is available for Key3. For
SEARCHF searching proceeds to the right of PD; for SEARCHR it proceeds to the
left. Even if Key2 is not specified, Jump Control 3 must be. The difference
between the data pointer position at the beginning of the instruction and just
before the update is stored in CTR0.

The data pointer (PD) points to the string to be translated and the
key pointer (PK) has the address of the translation table. PD and PK must both
contain byte addresses, although PK may point to either byte of the first word
of the translation table. Each character of the PD string is used as an offset
into the translation table. See Figure 1 for the hexadecimal representation of
the ASCII character set; this hex number is used as the offset. The character
is then replaced by the character found in the translation table. Each entry
in the translation table is a full word long and should have a high order byte
of zero. It is the low order byte which is used to replace the character in
the string. Translation is continued until the character which occupies the
low order byte of Key is encountered in the character string. This character
is not translated. If the jump control is not specified it is assumed to be
0,0. Note that the key pointer may not be updated. The difference between the
data pointer position at the beginning of the instruction and just before the
update is stored in CTR0.
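
The translation loop described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
D-Machine microcode: the names translate and UPPER_TABLE are hypothetical, the
string is modelled as a bytearray, and the sketch simply stops at the end of
the buffer if the stop character never appears (the text does not say what
the machine does in that case).

```python
def translate(data: bytearray, start: int, table: list[int], stop: int) -> int:
    """Translate data in place, beginning at index start.

    Each character code is used as an offset into the table, and the low
    order byte of the table word replaces the character. Translation ends
    at the stop character, which is left untranslated. The return value is
    the number of characters translated, standing in for the difference the
    text says is stored in CTR0.
    """
    pos = start
    while pos < len(data) and data[pos] != stop:
        entry = table[data[pos]]      # character code is the table offset
        data[pos] = entry & 0xFF      # only the low order byte is used
        pos += 1
    return pos - start


# Hypothetical table: one full word per ASCII code, mapping lower case
# letters to upper case and every other character to itself.
UPPER_TABLE = [ord(chr(code).upper()) for code in range(128)]

buffer = bytearray(b"abc#def")
count = translate(buffer, 0, UPPER_TABLE, ord("#"))
print(buffer, count)
```

Only the characters in front of the stop character are changed; the stop
character and everything after it are left as they were.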

2.8  Pseudo  Operations 

The pseudo operations are a set of instructions which either produce
no object code or produce object code which is not executed (i.e., is not
instructions).

EQU  Equate  label 

[label]       EQU       expression 

The form of the EQU is given above. The effect of the instruction
is to evaluate the operand 'expression' and to enter this value into the
location associated with 'label' in the symbol table. Thus the instruction:

IDENT         EQU       89

would have the same effect as placing the label 'IDENT' on the first
statement of the program (remember that code is generated from location 89 on
up). The form of 'expression' may be any of the following:

89                    Base 10 number

IDENT[± constant]     Identifier or identifier plus or minus a base
                      10 number

If an identifier is used in 'expression' it must have appeared as a label on
a statement previous to the EQU statement.

STORAGE  Allocate  storage  locations 

[label]   STORAGE       length 

The form of the instruction is given above. The number of full
words of storage to be allocated is specified as a base 10 integer in
'length'. On the D-Machine, whatever was in these locations will remain after
the current object code is loaded. The simulator receives the correct number
of words of zeros, since its memory must be completely specified.

CONSTANT  Generate  constants 

[label]   CONSTANT      length, operands

The CONSTANT instruction enables the user to produce any bit
pattern desired in full word regions. The total amount of core used by the
constant(s), in full words, must be specified as a base 10 number, 'length'.
While the total length of the instruction must be a whole number of full
words, the individual operands may be half words (bytes) long. Any number of
operands of the types listed below may be specified, and the total length of
the core used may be as large as the storage of the machine. Individual
operands, though, must be within the limits given below. All operands must be
separated from each other by a comma and must be of the following form:

'A QUOTE ('')'          A character literal, A QUOTE ('), max length 133
                        bytes including all quotes, may start and end on
                        half word

32767 or +32767         A full word integer, in this case largest possible
                        positive integer

-32768                  A negative integer, occupies full word

=VAR[± constant]        Address of VAR [± constant], occupies full word

< VAR[± constant] >     Byte address of high order byte of (VAR ±
                        constant), occupies full word

« VAR[± constant] »     Byte address of low order byte of (VAR ±
                        constant), occupies full word

:93ABC:                 Hexadecimal literal, may start and end on half
                        word, max length 131 nibbles (half bytes)

When the previous operand ended on a half word and the next operand
requires a full word, or if no other operands are present, the previous
operand is expanded. For character literals this means adding blanks (:20:).
For hexadecimal numbers it means adding zero bytes (:00:). Note that the
hexadecimal literal in the example above requires three bytes and that a zero
nibble must be added on the left to fill out the three bytes. Thus :093ABC:
and :93ABC: are equivalent.
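
The expansion rules for a hexadecimal literal can be illustrated with a
short Python sketch. It assumes a 16 bit D-Machine word (two bytes), that the
zero nibble is added on the left, and that the zero byte which completes the
last word is added on the right; the function name pack_hex_literal is
hypothetical and is not part of the assembler.

```python
def pack_hex_literal(literal: str) -> bytes:
    """Turn a literal such as ':93ABC:' into a whole number of words."""
    digits = literal.strip().strip(":").replace(" ", "")
    if len(digits) % 2:
        digits = "0" + digits        # zero nibble on the left fills the byte
    data = bytes.fromhex(digits)
    if len(data) % 2:
        data += b"\x00"              # zero byte completes the last full word
    return data


print(pack_hex_literal(":93ABC:").hex())
print(pack_hex_literal(":093ABC:").hex())
```

Both calls produce the same four bytes, which is the sense in which the two
spellings of the literal are equivalent.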

To fill an N word field with zeros use:
          CONSTANT  N, :0:

To fill an N word field with blanks use:
          CONSTANT  N, ' '

If too few core locations are allocated for the operands, then
truncation occurs on the right and an error message is produced. Correct
object code of the specified length is nevertheless generated.

EJECT  Page  eject 

Causes  next  instruction  in  program  to  appear  at  the  top  of  a  new 
page.   This  helps  make  program  listings  more  readable. 

ORG  Set  value  of  location  counter 

[label]     ORG        expression 

The  ORG  instruction  evaluates  'expression'  and  sets  the  location 
counter  equal  to  it.   If  a  label  is  placed  on  the  instruction  it  receives 
the  same  value.  Valid  expressions  are  base  10  numbers  and  previously  defined 
variables  plus  or  minus  an  offset. 

END  End  statement 

This  statement  should  contain  no  label  and  causes  the  assembler  to 
stop  working  on  the  user's  program.  It  is  not  necessary  for  the  assembler  to 
find  an  END  statement,  but  it  is  a  cleaner  way  to  terminate  an  assembly  than 
an  end  of  file  condition.  The  starting  location  for  execution  may  be  specified 
as  an  operand  on  the  END  card.   If  used,  it  must  be  a  base  10  number  or  a  label 
plus  or  minus  an  offset.   If  no  END  card  is  found  or  no  starting  location 
specified,  S-Memory  location  89  is  assumed. 

2.9  Comments

A card with an asterisk in column 1 is ignored by the assembler and
is intended for use as a comment. Column 72 is not checked for a continuation
mark on a comment card.

2.10  Input/Output

There are four I/O devices connected to the D-Machine: a disk, a
teletype, a line printer and a card reader. The next set of instructions
enables the programmer to control these devices.

The length of the data transfer must be in one of the following forms:

CTRn       Length is in CTRn

=CTRn      Stop character is in low order byte of CTRn

='#'       Stop character is #, may be a single character only

=:20:      Stop character is :20:, should be two digits long

If a stop character is used for the length operand, it is not transferred
during the I/O instruction's execution.

Addresses must be in one of the following forms:

Address          Word address

*Address         Single level indirect

**Address        Two levels indirect

< Address >      High order byte address

« Address »      Low order byte address

where 'Address' may be a label plus or minus an offset, or a base 10 number.

For a diagram of the object code format for the I/O instructions
see Figures 6-8.

The following is a list of the I/O instructions grouped by device.


Disk

[label]     READ      DISK, length, address
            WRITE

instruction length 3 words

These  instructions  transfer  whole  sectors  of  data  to  and  from  the 
disk.  The  length  should  specify  the  number  of  sectors  to  be  transferred.  A 
sector  contains  approximately  1000  characters.  The  address  is  the  word  address 
of  a  Control  Block  for  the  operation.  Byte  addressing  is  not  allowed  for  this 
operand and will result in an error. The contents of the Control Block are as
follows:


Word     Content

1        S-Memory byte address of last character read (supplied by I/O
         routine after READ, meaningless for WRITE).

2        Disk address of first sector to be read or written, must be
         supplied by user. See Figure 9 for format.

3        S-Memory byte address where first character of first sector is
         to be read from or written into, must be supplied by user.

4        S-Memory byte address where first character of second sector is
         to be read from or written into (supplied by I/O routine after
         READ, meaningless for WRITE).

N+2      S-Memory byte address where first character of Nth sector is
         to be read from or written into (supplied by I/O routine after
         READ, meaningless for WRITE).


Notice  that  if  N  sectors  are  to  be  transferred,  the  Control  Block  contains 
N+2  words. 
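
As an illustration of the N+2 word layout, the following Python sketch builds
a Control Block for a transfer of a given number of sectors. It assumes the
words appear in the order listed above, with the two user-supplied words in
the second and third positions; the function name and the sample addresses
are hypothetical, and the disk address is treated as an opaque 16 bit value
because its format is given only in Figure 9.

```python
def build_control_block(disk_address: int, first_buffer: int, sectors: int) -> list[int]:
    """Return a Control Block of sectors + 2 words, as a list of integers.

    The list is indexed from zero, so block[0] is the first word. Words not
    supplied by the user are left at zero here; according to the text they
    are filled in by the I/O routine after a READ.
    """
    if sectors < 1:
        raise ValueError("at least one sector must be transferred")
    block = [0] * (sectors + 2)
    block[1] = disk_address      # disk address of the first sector (user)
    block[2] = first_buffer      # S-Memory byte address for first sector (user)
    return block


block = build_control_block(disk_address=0x0123, first_buffer=400, sectors=3)
print(len(block), block)
```

A three sector transfer therefore needs five words of storage for its
Control Block.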


Teletype

[label]     READ      TTY, length, address   [ FEED ]
            WRITE                            [NOFEED]

instruction length 3 words


These instructions print character strings on the teletype and in
turn receive them, storing them in S-Memory. The length operand for the
teletype instructions specifies the length of the string in bytes
(characters) to be transferred. The address must be a byte address of the
S-Memory location containing the first byte to be transferred (for a WRITE)
or the receiving location of the first byte (for a READ). If a word address
is given, it is converted to the byte address of the high order byte of that
word. Subsequent bytes are loaded and removed contiguously. A carriage
control is optional as a second field for the instruction. FEED produces a
carriage return and line feed, for the WRITE instruction only. NOFEED
suppresses this action. The default is FEED.


Printer

[label]     WRITE     LINE, length, address  [ FEED ]
                                             [NOFEED]

instruction length 3 words

Figure 6.   I/O Instruction Object Code Format, Word 1


Flag Block

Bits     Purpose

15       =0, CTRn has length of data, bits 3-0 contain n
         =1, There is a stop character

14       =0, Bits 3-0 have CTR register number containing stop character
         =1, Stop character is bits 7-0

13       =1, Carriage return line feed

12-8     Unused

7-0      Stop character

3-0      CTR number


Figure 7.   I/O Instruction Object Code Format, Word 2
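
The flag block can be illustrated by a small Python sketch that encodes the
four forms of the length operand. It is a sketch only: the function name is
hypothetical, the character code of a quoted stop character is taken to be
ASCII, and no checking of the CTR register number is attempted beyond masking
it to four bits.

```python
def encode_flag_word(length: str, feed: bool = True) -> int:
    """Encode a length operand and carriage control as a 16 bit flag word."""
    word = 0x2000 if feed else 0          # bit 13: carriage return line feed
    if not length.startswith("="):
        # CTRn: bit 15 = 0, the register number goes in bits 3-0
        return word | (int(length[3:]) & 0xF)
    word |= 0x8000                        # bit 15: there is a stop character
    body = length[1:]
    if body.startswith("CTR"):
        # =CTRn: bit 14 = 0, the register holding the stop character
        return word | (int(body[3:]) & 0xF)
    word |= 0x4000                        # bit 14: stop character in bits 7-0
    if body.startswith("'"):
        stop = ord(body[1])               # ='#' form, a single character
    else:
        stop = int(body.strip(":"), 16)   # =:20: form, two hex digits
    return word | (stop & 0xFF)


for operand in ("CTR3", "=CTR2", "='#'", "=:20:"):
    print(operand, format(encode_flag_word(operand), "04X"))
```

The default of feed=True mirrors the statement that FEED is the default
carriage control.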


Address Block

Bits     Purpose

15-14    =00, Direct addressing
         =01, One level indirect
         =10, Two levels indirect

13-0     Address


Figure 8.   I/O Instruction Object Code Format, Word 3
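
A sketch of how the address block might be assembled from an operand is
given here in Python. It handles only the word address forms (direct, one
leading asterisk, two leading asterisks) with a base 10 number or a plain
label; offsets and the byte address forms are left out because their encoding
is not described in this section. The function name and the symbols
dictionary are hypothetical.

```python
def encode_address_word(operand: str, symbols: dict[str, int]) -> int:
    """Encode an address operand as a 16 bit address block."""
    levels = len(operand) - len(operand.lstrip("*"))   # count of leading '*'
    if levels > 2:
        raise ValueError("at most two levels of indirect addressing")
    name = operand[levels:]
    address = int(name) if name.isdigit() else symbols[name]
    if not 0 <= address < (1 << 14):
        raise ValueError("address does not fit in bits 13-0")
    return (levels << 14) | address       # bits 15-14 hold the indirect level


symbols = {"BUF": 0x0123}
for operand in ("100", "BUF", "*BUF", "**BUF"):
    print(operand, format(encode_address_word(operand, symbols), "04X"))
```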

Figure 9.   Disk Address Word

This instruction is used to print a character string on the line
printer.  The  length  should  specify  the  number  of  characters  (bytes)  to  be 
printed.  The  address  should  be  the  byte  address  of  the  first  character  in  the 
string.  If  a  word  address  is  used,  it  is  converted  to  the  byte  address  of  the 
high  order  byte  contained  in  that  word.  A  second  field  specifying  a  carriage 
control  is  optional.  If  FEED  is  used,  the  characters  are  printed  beginning  on 
a  new  line.   If  NOFEED  is  used,  printing  begins  at  the  next  location  in  the 
present  line.   The  default  is  FEED. 

Card  Reader 

[label]      READ  CARD,  length,  address 

instruction  length  3  words 

This instruction enables the user to transfer a character string
from a card in the card reader to an S-Memory location. The length
specification indicates the number of columns read, starting with column one.
The address is the byte address of the S-Memory location which will receive
the character in column one. If a word address is specified, it is converted
to the byte address of the high order byte of that word.

2.11  Errors

Any  instruction  which  violates  the  syntax  rules  described  in  sections 
2.6-2.10  will  cause  an  error  message  to  be  printed  out  under  that  statement  in 
the  listing,  which  explains  the  first  error  detected  in  that  instruction.  The 
assembler  will  find  only  one  error  per  instruction  so  it  is  helpful  to  check 
the  entire  instruction  before  reassembling  any  instruction  so  flagged. 

When an instruction contains an error, the object code produced for
that instruction will be omitted for the D-Machine and consist of zeros for the
simulator. For all errors except an undefined mnemonic the location counter
will be incremented by the correct amount, so patching of the object deck is
possible. Undefined mnemonics cause the location counter to be incremented by
four.

The  only  exception  to  the  above  is  errors  in  length  specified  on  a 
CONSTANT  instruction.  These  instructions  produce  correct  object  code  up  to 
the  length  specified  and  produce  a  warning  message  instead. 

At  the  end  of  the  assembly  the  number  of  errors  found  in  the  program 
is  listed. 

It  is  possible  that  a  run  of  the  assembler  might  cause  a  user  abend. 
A  list  of  the  abends  in  the  program  follows  below.   Only  numbers  1  and  100 
should  ever  occur. 


Abend     Purpose

1         Intermediate file overflow, at present maximum length of
          program is 1500 cards. To increase size see STORE routine.

2         Programming error in assembler, see BLANKOP in EVALUATE
          routine.

3         Programming error in assembler, see SIMPUNCH routine.

4         Programming error in assembler, see DPUNCH routine.

5         Illegal character found in SCAN routine

100       Program contains DUMP statement which is a debugging aid.
          Dump occurs during pass two,

3.   IMPLEMENTATION OF ASSEMBLER

3.1  Assembler  Background 

The S-Language Assembler was written in 360 Assembly Language. It
consists of eleven subroutines (which will be individually described later in
this chapter) and is about 5000 statements in length. The program runs in a
region of 180K bytes of memory, of which 117K is the intermediate file.
Because the intermediate file is kept in core and the program is coded in
assembly language, it is able to assemble about 15,000 cards a minute.

The advantages of writing the program in assembly language include
easy character manipulation using the TRT (translate and test) instruction,
plus efficient core utilization. The real advantage of assembly language
programming can best be seen by comparing the hashing scheme used in this
assembler, coded in assembly language, with the identical scheme coded in a
high level language like PL/1. The assembly language version actually requires
fewer statements and, as one would expect, beats the PL/1 version (actually
PL/C) in execution speed [2].

A primary disadvantage is the fact that the development time of
projects written in assembly language typically is longer than for the same
project coded in a higher level language [3]. Furthermore, the IBM
G-level assembler has a number of features which tend to slow it down.
Consequently, development costs were also higher than they might have been.
For example, assembling PASS2, the longest routine of the program, during
prime time costs over $10. The entire program costs about $30 to assemble.
Naturally, long runs were made at times when lower rates were in effect.

Whether the additional time and cost involved in developing an
efficient assembler were worthwhile will be shown only when the D-Machine
has been in full operation for some time.

3.2  Assembler Overview

The  S-Language  Assembler  is  a  two  pass  assembler.   Pass  one  in 
general  builds  the  intermediate  file  and  generates  the  symbol  table.  Pass 
two  does  the  balance  of  the  work  of  the  assembler,  namely,  resolving  address 
specifications  involving  variables  and  actually  generating  the  object  code 
and  program  listing. 

Strictly speaking, the symbol table is not completely generated by
pass one since it is never empty. The reason for this is that the reserved
storage locations (see section 2.4) are usually referred to by name and
consequently their definitions are a permanent part of the symbol table.

Most of the pseudo operations are handled in pass two, but some, like
EQU and OPTION statements, are the work of pass one.

In  the  subsequent  text  a  distinction  must  be  made  between  "pass  one" 
and  "pass  two"  which  distinguish  the  two  separate  times  that  the  statements 
are  examined  by  the  assembler,  and  "PASS1"  and  "PASS2"  which  are  names  for 
routines  in  the  program.  The  above  convention  will  be  held  throughout  the 
remainder  of  this  paper. 

To  see  how  the  various  routines  are  used  in  the  processing  of  a 
program  we  will  trace  the  path  of  a  single  instruction  through  the  assembler. 
Let  us  assume  that  the  instruction  is  the  following  word  instruction: 

LBL       ADD       A,B       ;   THIS  IS  AN  ADD 

PASS1 first checks to see if there is a label. Finding one, it
hashes LBL into the symbol table and inserts the current value of the location
counter and the statement number into the symbol table as well. At this time
a check is made to make sure that the mnemonic starts in the proper column. It
then takes the mnemonic ADD and finds it in the table of legal mnemonics. Since
ADD is the first instruction in the list, its offset from the start of the
table is zero. Therefore column nine of the card image (in core) is set to
zero.

It looks into another table (at an offset of zero) to determine the
length of the instruction; for ADD it is four (words). The location counter
is incremented by this amount. The card image is then stored in the next
available position in the intermediate file by calling the STORE routine.

As  soon  as  all  cards  are  in  the  intermediate  file,  PASS1  is  finished 
and  PASS2  of  the  program  is  called. 
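
The pass one bookkeeping just traced can be summarized in a short Python
sketch. It is a simplified model rather than the assembler's own code: the
symbol table is an ordinary dictionary instead of the hashed table, the
mnemonic offset is kept beside the card image rather than in column nine, no
error checking is done, and the second mnemonic "SUB" with its length is
made up purely so that the table has more than one entry.

```python
START_LOCATION = 89               # code is generated from location 89 on up
MNEMONICS = ["ADD", "SUB"]        # "SUB" is a made-up second entry
LENGTHS = [4, 4]                  # ADD is four words; the SUB length is assumed


def make_card(label: str, mnemonic: str, operands: str = "") -> str:
    """Build an 80 column card image with the mnemonic in column 10."""
    return f"{label:<9}{mnemonic:<10}{operands}".ljust(80)


def pass_one(cards: list[str]):
    """Return the symbol table and the intermediate file for a program."""
    symbols = {}
    intermediate = []
    location = START_LOCATION
    for statement, card in enumerate(cards, start=1):
        label = card[:8].strip()                  # columns 1-8
        if label:
            symbols[label] = (location, statement)
        mnemonic = card[9:].split()[0]            # starts in column 10
        offset = MNEMONICS.index(mnemonic)        # position in mnemonic table
        intermediate.append((offset, card))
        location += LENGTHS[offset]               # advance location counter
    return symbols, intermediate


program = [
    make_card("LBL", "ADD", "A,B"),
    make_card("", "ADD", "A,B"),
    make_card("NEXT", "SUB", "A,B"),
]
symbols, intermediate = pass_one(program)
print(symbols)
```

Each label is recorded with the location counter value and statement number
in effect when its card is read, which is all pass two needs in order to
resolve references to it.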

PASS2 gets the first card image of the next statement to be processed
by calling the LOAD routine. It then clears the area reserved for object
code so that OR instructions may be used to set bits in the code. It then
checks the status of column nine of the card image. Since column nine
contains a number less than the smallest error number (errors are FF to DC in
hexadecimal), it finds the length of the instruction in the same table used
by pass one. It then looks at the appropriate entry in a branch address
table (zero offset for ADD) and goes to the proper section. In our case it
is to the word instruction section. If we were following a string or I/O
instruction the branch would be to a routine outside of PASS2, namely,
STRINGS or INOUT respectively.

In the word instruction section, the OP field for an ADD is moved
to the first word of object code. The next step is to call the SCAN routine
to extract operand A. The EVALUATE routine is then called to process A. It
first looks up A in the symbol table (using the same hashing technique as
PASS1) and finds its value (i.e., its location in core). It stores this in
word two of the object code and sets bits one and two of word one (the OP
field) to indicate that operand one is a straight address. EVALUATE then
returns to PASS2.


PASS2 calls SCAN to get operand B, then calls EVALUATE to process it
in the same way it handled A. From information passed by SCAN with
operand B, PASS2 sees that operand three is missing. It checks the indirect
bits of operand one (bits one and two of word one) and, seeing that A was of
a correct form, duplicates word two (address of A) into word four (the third
operand).

The object code for the instruction is now complete. PASS2 prints
out the location counter, the object code, the statement number and the card
image in the program listing. It then branches to the object code punch
routine address, which was either set in PASS1 through an OPTION statement or
remains at NOPUNCH by default. This completes PASS2's work. It increments
the location counter by the length which it found earlier and goes to get the
next card image from the intermediate file.

In case an error is found in SCAN or EVALUATE, each routine has an
error return which causes control to be passed to an error handling section of
PASS2. This will cause the card image to be printed out with an error message
below it, based on the error message number stored in column nine of the card
image. For a list of these errors see Figure 10. The location counter would
be incremented by an amount equal to the length found earlier and the next
card image would be sought in the intermediate file.

3.3  Column Nine

When the assembler was first being written it was thought that some
day the intermediate file might be written out on disk instead of being kept
in core. Since the location of the file could change, it was desirable to keep
the amount of data transferred to and from the file a number of full words in
length. A single card image (80 columns long) is 20 words in length.
Transferring a single card image at a time would be fine, except that certain
information concerning each card image is useful to keep in the intermediate
file. Such things as errors found in pass one, the number of the mnemonic if
the card is an instruction, and whether the card has been printed yet, are all
useful.


Column nine, Contents in Hex

FF  FE  FD  FC  FB  FA  F9  F8  F7  F6  F5  F4  F3  F2  F1  F0
EF  EE  ED  EC  EB  EA  E9  E8  E7  E6  E5  E4  E3  E2  E1  E0
DF  DE  DC  40  Anything Else

Purpose

Label contains illegal character
Label has imbedded blank
Mnemonic does not begin in column 10
Undefined mnemonic
Multiple defined label
Column one of label not alphabetic
Continuation card starts before column 19
Illegal continuation attempted
Unmatched apostrophe in operand
Operand longer than 133 characters
Operand missing
Label in operand too long
Incorrect form for an offset
Label is undefined
Illegal operand
Literal is too long
Error in numeric literal
Negative address
Illegal first operand type with missing operand
Operand value too big for 16 bits
Illegal use of apostrophe in literal
Illegal literal
Illegal double indirect addressing
No label on an EQU
Truncation in CONSTANT (warning)
Card has been printed
Card image is a fake, all blank
Illegal indirect addressing
Too many operands in a field
Too few operands in a field
Field is missing
Too many fields
Illegal I/O device
No blank before semicolon
Blank card, real one
Position of mnemonic in table


Figure 10.   Conditions of Column Nine

Since the intermediate file was to be in core for at least some time,
it was also desirable not to use any more core than necessary. To allow even
a single extra word for each card image would increase the size of the
intermediate file by almost 6K. This is why a method for coding the
information into the original 80 columns was developed. By requiring the user
to punch the mnemonic in column 10 on all instruction cards, and further by
placing a limit of eight characters on all labels, it was guaranteed that
column nine of all statements except comments would be unused. This could
then be used for storing information about each card.

Only  one  piece  of  information  is  stored  at  any  one  time,  but  it  was 
found  that  this  was  sufficient.  Most  statements  use  the  column  twice,  once  to 
store  the  mnemonic  number,  found  in  pass  one,  and  once  to  indicate  the  card  has 
been  printed. 

The  codes  for  column  nine  are  listed  in  Figure  10. 
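
The way a routine might interpret the byte in column nine can be sketched
in Python as follows. The ranges are taken from the text (codes FF down to
DC are error and status codes, 40 is the blank of a real blank card, and
anything else is a position in the mnemonic table), but the order in which
the tests are made and the function name are assumptions of this sketch, not
a description of PASS2.

```python
def classify_column_nine(value: int) -> str:
    """Classify the byte stored in column nine of a card image."""
    if not 0 <= value <= 0xFF:
        raise ValueError("column nine holds a single byte")
    if value == 0x40:
        return "blank card"               # hex 40: a real blank card
    if value >= 0xDC:
        return "error or status code"     # the codes FF down to DC
    return "mnemonic position"            # offset into the mnemonic table


for value in (0x00, 0x40, 0xDC, 0xFF):
    print(format(value, "02X"), classify_column_nine(value))
```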

3.4  Symbol Table Management and Hashing Scheme

The information about all the labels is stored in the symbol table.
The table is 1021 entries long and each entry requires three words (32 bits
each) of storage. This is 360 storage, since the table need be present only
during assembly of the program and not during execution. For the form of a
symbol table entry see Figure 11.

Since 64 register definitions are in the symbol table at all times,
an S-Language program may contain at most 957 labels. This should be
sufficient for any 1500 card program.


Bits               Usage

Words 1 and 2      Contain label

Word 3, 31-28      Unused

Word 3, 27-16      Statement number where label is defined

Word 3, 15-0       Value of label


Figure 11.   Form of Symbol Table Entry

The  purpose  of  the  hashing  routine  is  to  take  any  legal  label  and  to 
associate  with  this  label  one  of  the  entry  positions  in  the  symbol  table,  in 
our  case  one  of  the  1021  locations.   This  process  is  called  hashing.   It  is 
desirable  for  different  labels  to  hash  into  different  locations  as  often  as 
possible.  When  two  labels  fail  to  find  unique  spots  it  is  called  a  collision. 

The  goal  of  a  good  hashing  scheme  is  to  perform  the  hash  function  as 
quickly  as  possible  but  yet  to  minimize  the  number  of  collisions. 

The first step of the hashing scheme is to reduce the label to a
single 32 bit word. This is done by considering the label to be two words
long (4 characters per word on the IBM 360) and by multiplying the two words
together. If the label is less than five characters long, the label is already
contained in a single word, in which case it is squared.

The above process, which produces a 64 bit result, gives a good
distribution to the middle 32 bits [4]. These 32 bits are then divided by the
size of the table (1021). The remainder of this division is used as the entry
position. It is always between 0 and 1020, so it is used as an offset.
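
The hash function can be sketched in Python as follows. This is a model
under several assumptions that the text does not settle: labels are encoded
in EBCDIC (Python's cp037 codec) and padded on the right with EBCDIC blanks,
the multiplication is treated as unsigned, and the "middle 32 bits" are taken
to be bits 47 through 16 of the 64 bit product. The names hash_label and
middle_32 are hypothetical and do not correspond to labels in the assembler.

```python
TABLE_SIZE = 1021


def middle_32(product: int) -> int:
    """Return bits 47-16 of a 64 bit product."""
    return (product >> 16) & 0xFFFFFFFF


def hash_label(label: str) -> int:
    """Map a label of one to eight characters to a table offset 0..1020."""
    raw = label.encode("cp037")                   # EBCDIC, assumed
    if not 1 <= len(raw) <= 8:
        raise ValueError("labels are one to eight characters long")
    if len(raw) < 5:
        # the label fits in a single word, which is squared
        word = int.from_bytes(raw.ljust(4, b"\x40"), "big")
        product = word * word
    else:
        # two words of four characters each are multiplied together
        padded = raw.ljust(8, b"\x40")
        first = int.from_bytes(padded[:4], "big")
        second = int.from_bytes(padded[4:], "big")
        product = first * second
    return middle_32(product) % TABLE_SIZE


for name in ("LBL", "IDENT", "A"):
    print(name, hash_label(name))
```

Because 1021 is used as the divisor, every result falls in the range of
valid entry positions regardless of the label.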


When a collision occurs the label must be rehashed to a new location.
The assembler uses what is called a linear rehash, which is a fancy name for a
sequential search of the rest of the table. This is not the best rehash scheme,
but is a good one if the table is not densely populated. This was the
justification for its use in the S-Language Assembler, since a 1500 card
program should never use anywhere near 1021 labels.
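
A linear rehash can be sketched in Python as a sequential probe that starts
at the hashed position. The sketch assumes that the search wraps around from
the last entry to the first, which the text does not state, and it takes the
hashed position as a parameter so that it does not depend on any particular
hash function. Empty entries are represented by None here, whereas the
assembler's table holds zeros; all names are hypothetical.

```python
TABLE_SIZE = 1021


def insert_label(table: list, home: int, label: str, value: int) -> int:
    """Insert a label starting at its hashed position and return its slot."""
    for step in range(TABLE_SIZE):
        slot = (home + step) % TABLE_SIZE     # wrap-around is an assumption
        entry = table[slot]
        if entry is None:
            table[slot] = (label, value)      # first free position wins
            return slot
        if entry[0] == label:
            raise ValueError("multiple defined label")
    raise OverflowError("symbol table is full")


table = [None] * TABLE_SIZE
print(insert_label(table, 5, "ALPHA", 89))    # no collision
print(insert_label(table, 5, "BETA", 93))     # collides, takes the next slot
```

With a sparsely filled table the probe usually ends after one or two
steps, which is the situation the text relies on.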

3.5  Subroutine Descriptions

The  following  section  includes  a  description  of  each  of  the  11 
routines  comprising  the  program. 

PASS1

PASS1's primary role is to build the symbol table. In order to do
this it must keep track of the effect of each instruction on the location
counter. Therefore, CONSTANT and STORAGE instructions must have their lengths
established and ORG instructions must also be evaluated at this time.
Naturally, since they introduce entries in the symbol table, EQUs are handled
in PASS1 too. If an OPTION statement occurs in the program it is taken care of
here also.

The  following  errors  are  checked  for  in  PASS1:   undefined  mnemonic, 
incorrect  labels,  mnemonic  not  starting  in  column  10,  labels  not  starting  in 
column  1,  labels  being  defined  more  than  once  and  incorrect  form  for  operands 
of  the  pseudo-operations,  mentioned  above,  which  are  evaluated  in  this  routine. 

Mnemonics are found by doing a binary search on a table called
STARTOPS. If a mnemonic is found in the table, the offset into the table is
stored in column 9 of the card for later use. This offset is also used in a
table called OFFSETS. This table has two entries for each legal instruction.
The first entry contains an offset in a branch table and the second entry is
the length of the instruction in words. It is this last entry which is stored
in location LENGTH, to serve as an increment to the location counter.
Instructions which have a variable length, like CONSTANT, STORAGE and ORG,
have a zero length in the table, but fill in LENGTH when their operands are
evaluated.
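
The lookup can be illustrated with a Python sketch built on the standard
bisect module. The table names follow the text, but their contents here are a
small made-up subset: the lengths shown for ADD, READ and WRITE and the zero
lengths for CONSTANT, ORG and STORAGE come from this document, while the
branch table offsets are invented for the example.

```python
from bisect import bisect_left

# Sorted mnemonic table (a subset) and a parallel table of
# (branch table offset, length in words); branch offsets are made up.
STARTOPS = ["ADD", "CONSTANT", "ORG", "READ", "STORAGE", "WRITE"]
OFFSETS = [(0, 4), (1, 0), (2, 0), (3, 3), (1, 0), (3, 3)]


def lookup(mnemonic: str):
    """Binary search for a mnemonic.

    Returns (position, branch offset, length), or None when the mnemonic
    is not in the table. A length of zero marks a variable length
    instruction whose length is filled in once its operands are evaluated.
    """
    position = bisect_left(STARTOPS, mnemonic)
    if position == len(STARTOPS) or STARTOPS[position] != mnemonic:
        return None                       # undefined mnemonic
    branch, length = OFFSETS[position]
    return position, branch, length


print(lookup("ADD"))
print(lookup("STORAGE"))
print(lookup("FOO"))
```

Keeping the two tables parallel means that the single offset found by the
search is enough to reach every other piece of information about the
instruction.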

The branch table (BTABLE) is used after the offset and length of an
instruction have been established. This table sends the instruction to one
of the following sections: SAWB for strings and word instructions, SACB for
storage and constants, END for end statements, EQUB for EQUs, OPTION for
options, ORGB for ORGs, and NOERROR for page ejects.

The hashing scheme described in section 3.4 may be found at a
location named HASH.

The intermediate file is generated during PASS1 by calling the STORE
routine. This has the effect of loading the contents of the 80 characters
starting at location CARD into the intermediate file.

The  symbol  table  is  part  of  PASS1  and  is  called  SYMBLTBL.  It  contains 
zeros  for  entries  not  used. 

When  PASS1  is  finished  it  calls  PASS2.   PASS2  eventually  returns  to 
PASS1 and then the run is terminated. The register usage for PASS1 is as
follows:

Register    Usage

R15         Branch address
R14         Return address
R13         Save area pointer
R12         Base register
R10         Statement number
R9          Card number
R8          Location counter
R7          Internal return point


PASS2 

PASS2  does  most  of  the  work  in  the  assembler.   It  does  all  the 
printing  for  the  program  listing  and  opens  and  closes  all  the  output  DCBs. 

As  each  card  is  taken  from  the  intermediate  file  using  the  LOAD 
routine,  it  is  first  checked  to  see  if  it  is  a  comment.   If  it  is  not  a 
comment then column 9 is checked for one of two things: either the error
number of an error found in PASS1 or the mnemonic's number. If it is the
former, PASS2 branches to the error routine at BADBAD. If the latter, then
it  finds  an  offset  in  a  table  called  BTBL  and  uses  this  offset  in  a  table  of 
branch  addresses  called  BADDS.   There  is  an  individual  section  of  PASS2  for 
each  string  instruction  and  each  pseudo-operation.   There  is  also  a  section 
for  word  instructions. 

Word  instructions  use  a  table  called  BITPAT  extensively  to  generate 
their  object  code.   There  is  an  entry  in  BITPAT  for  each  instruction,  but  it  is 
meaningless  for  all  but  word  instructions.   This  enables  PASS2  to  use  the 
original  offset  from  the  mnemonic  table  in  PASS1  as  the  offset  in  BITPAT.   Each 
word  instruction  has  five  entries  in  BITPAT.  The  first  two  are  the  OP  field  for 
the first word of the object code. See Figures 2-4. The last three entries
give the status of the three possible operands. For instance, the second and
third operands of the JUMP instruction may not be present; this is indicated in
the table. Similarly, the first operand for convert to binary must be a byte
address and this is also indicated.

Each  word  instruction  looks  at  the  specification  for  each  operand  and 
then takes the correct action for that type of operand. Generally it involves
calling  SCAN  to  rip  off  the  operand  and  then  calling  EVALUATE  to  set  the  bits 
for  the  operand. 

String  instructions  merely  go  to  a  section  which  calls  the  STRINGS 
routine. Each instruction has a separate entry point in STRINGS so it makes
sense  to  have  the  few  extra  lines  of  code  for  the  calling  sequence  duplicated 
rather  than  using  one  call  to  STRINGS  and  having  to  find  the  instruction  type 
in  that  routine.  STRINGS  generates  all  the  object  code  for  string  instructions. 

After  the  object  code  is  generated  for  a  string  or  word  instruction, 
control  is  passed  to  a  section  called  INSTCOMP.   This  section  loads  the  object 
code  into  a  buffer  which  holds  as  much  object  code  as  can  fit  on  an  object 
deck card. This amount is 16 bytes for the simulator and 32 bytes
for  the  D-Machine.   If  necessary,  the  buffer  is  output  by  calling  DPUNCH  or 
SIMPUNCH. 

The  object  code  is  then  converted  to  EBCDIC  and  output  on  the  listing 
along  with  the  card  image.   This  is  done  in  a  section  called  PRTCARD.  After  the 
location  counter  is  incremented,  the  next  card  is  removed  from  the  intermediate 
file.

Constant  and  storage  instructions  must  be  reevaluated  in  PASS2  since 
their  lengths  are  not  saved.  Furthermore,  the  individual  operands  from  the 
constant instruction are evaluated. The process is as follows: SCAN is called
to rip off the next operand, then CONEVL, a section of EVALUATE, is called to
develop the object code. Unlike regular instructions, which let EVALUATE or
STRINGS handle the task, PASS2 itself puts the object code into the space
reserved for it. There are separate sections devoted to printing out CONSTANT
instructions, called INST1F and INST1B, depending on whether the last
operand required a full word or could end on a half word (byte).

All routines called by PASS2 have an error return provision. If an
error is detected the return is to the address passed in general register 14.
If no error is detected the return is to this address plus four bytes.
Therefore, the first instruction after a subroutine call is always an
unconditional branch to the error handling section.

Error numbers are stored in column nine of the current card image.
See Figure 10. The error section prints the card image, then passes control
to a section named EMSGP to print the error message. The buffer is always
punched after an error and a number of zeros equivalent to the length of the
current instruction are punched for the simulator. No object code is punched
for the D-Machine from a statement found to be in error.
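
The error return provision can be modelled in a few lines of Python. This is a
sketch under stated assumptions, not the assembler's code: the routine value,
the caller call_site and the error number 3 are invented for illustration, and
the return offset is passed back explicitly because Python has no return
address to adjust.

```python
ERROR_RETURN = 0    # return to the address passed in general register 14
NORMAL_RETURN = 4   # return to that address plus four bytes


def value(text):
    """Hypothetical called routine: returns (return offset, result)."""
    if not text.isdigit():
        return ERROR_RETURN, 3       # invented error number
    return NORMAL_RETURN, int(text)


def call_site(text):
    """Mimics a caller: offset 0 lands on the branch to the error section,
    offset 4 skips that branch and continues with the normal path."""
    offset, result = value(text)
    if offset == ERROR_RETURN:
        return "error %d" % result   # stands in for the error section
    return "value %d" % result       # normal continuation
```

The point of the convention is that the caller never tests a status flag; the
called routine selects the path simply by choosing where it returns.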

Input-output is handled in much the same way string instructions are.
The only difference is that the routine called is INOUT.

The  output  listing  allows  60  lines  per  page.  Register  11  contains 
the  lines  printed  on  the  page  at  any  given  time.   Before  any  line  is  output  a 
check  is  made  of  register  11,  and  if  greater  than  or  equal  to  60  it  goes  to  a 
new  page  before  printing  the  line.  EJECT  statements  simply  cause  register  11 
to  be  set  to  60. 
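
A minimal Python sketch of this page control follows. The class and method
names are hypothetical; the attribute lines_on_page plays the role of
register 11, and the limit of 60 lines is taken from the text.

```python
LINES_PER_PAGE = 60


class Listing:
    """Toy model of the output listing; pages are kept as lists of lines."""

    def __init__(self):
        self.pages = [[]]
        self.lines_on_page = 0       # plays the role of register 11

    def print_line(self, text):
        # The check is made before the line is output.
        if self.lines_on_page >= LINES_PER_PAGE:
            self.pages.append([])    # go to a new page
            self.lines_on_page = 0
        self.pages[-1].append(text)
        self.lines_on_page += 1

    def eject(self):
        # EJECT only marks the page as full; the page break itself happens
        # when the next line is printed.
        self.lines_on_page = LINES_PER_PAGE
```

Because EJECT merely sets the counter, an EJECT that is not followed by any
output does not produce an empty page in this model.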

After  an  END  card  is  found  or  when  the  last  card  in  the  intermediate 
file  is  processed  by  PASS2,  if  cards  for  the  D-Machine  are  being  punched,  a 
card  giving  the  starting  execution  point  is  punched.  It  has  the  same  format 
as  the  object  deck  cards  except  column  five  contains  an  'E'  and  the  address  on 
the  card  is  the  position  for  execution  to  begin.  For  a  description  of  the  END 
card  see  section  2.8. 

After  a  program  has  been  processed  by  the  assembler,  the  symbol  table 
is  printed  out  in  alphabetical  order.   To  alphabetize  the  table,  a  linear  search 
is made of the possible 1021 entries. Every time a nonblank entry is found, the
address  of  the  entry  is  stored  in  the  next  available  location  on  top  of  the 
intermediate  file.  A  bubble  sort  is  then  used  to  sort  the  addresses  according 
to  the  labels  they  point  to.   The  information  printed  includes  the  label,  the 
statement  number  at  which  it  was  defined  and  the  label's  value. 
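
The alphabetizing step can be sketched in Python as shown below. The layout of
an entry as a (label, statement number, value) tuple and the use of None for an
unused slot are assumptions made for the example, and list indices stand in
for the entry addresses. Python compares strings by character code, which is
not the EBCDIC collating sequence the assembler would see.

```python
def sorted_symbols(table):
    """Return the used entries of the table in order of their labels."""
    # Linear search: record the "address" (index) of every used entry.
    addresses = [i for i, entry in enumerate(table) if entry is not None]

    # Bubble sort the addresses according to the labels they point to.
    for end in range(len(addresses) - 1, 0, -1):
        for j in range(end):
            if table[addresses[j]][0] > table[addresses[j + 1]][0]:
                addresses[j], addresses[j + 1] = addresses[j + 1], addresses[j]

    return [table[a] for a in addresses]
```

Sorting the addresses rather than the entries leaves the hashed table itself
undisturbed, so the entries never have to be moved.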

After  the  symbol  table  is  printed,  the  DCBs  are  closed  and  control  is 
passed back to PASS1. The register usage for PASS2 is as follows:

Register    Usage

R15         Branch address
R14         Return address
R13         Save area pointer
R12         Base register
R11         Lines printed on page
R10         Statement number
R9          Location in intermediate file
R8          Base register
R7          Internal return point
R6          Points to CARD

EVALUATE 

EVALUATE  handles  the  operands  for  all  instructions.  For  certain 
types  of  instructions  it  returns  a  value,  while  for  others  it  actually  sets 
the  bits  in  the  object  code.  Each  type  of  instruction  has  a  separate  entry 
point  in  the  routine.  Word  instructions  use  EVALUATE,  EQUs  use  EQUEVL, 
STORAGE  uses  SCEVL,  CONSTANTS  use  CONEVL  and  strings  use  STRINGVL. 

The  sections  of  code  at  each  of  these  entry  points  use  common 
routines  to  process  the  various  types  of  operands.   Straight  addresses,  either 
base  10  or  a  label,  with  or  without  an  offset,  are  handled  by  a  section  called 
LEVAL. It returns the result in general register 11. If a label is used, a
symbol table lookup is performed. The hash routine described in section 3.4 may
be found at location HASH.

To  determine  the  magnitude  of  a  base  10  number  in  EBCDIC  a  routine 
called VALUE is used. It returns the value in general register 7.

Character literals are handled by a routine called ALPHALIT. When
called  the  literal  should  be  in  location  OPERAND  of  PASS2.   On  return  the 
literal  is  still  in  OPERAND,  but  leading  and  trailing  apostrophes  have  been 
removed  and  double  apostrophes  in  the  literal  itself  have  been  reduced  to 
single  ones.  The  literal  is  no  longer  in  EBCDIC  though;  it  has  been  converted 
to  ASCII.  The  length  of  the  literal  is  in  general  register  one. 
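
The effect of ALPHALIT can be illustrated with a short Python function. This
is a sketch only: the function name is hypothetical, and the choice of code
page cp037 as "EBCDIC" is an assumption, since the text does not say which
EBCDIC variant or translation table the assembler uses.

```python
def alphalit(operand_ebcdic):
    """Return (ASCII literal, length) for a quoted EBCDIC character literal."""
    text = operand_ebcdic.decode("cp037")      # assumed EBCDIC code page
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        raise ValueError("not a quoted character literal")
    # Remove the enclosing apostrophes, then reduce doubled apostrophes
    # inside the literal to single ones.
    body = text[1:-1].replace("''", "'")
    ascii_bytes = body.encode("ascii")         # literal is now in ASCII
    return ascii_bytes, len(ascii_bytes)
```

For example, the literal 'IT''S' comes back as the four ASCII characters IT'S,
with the length taking the place of the value left in general register one.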

Byte  addresses  use  a  section  called  BADDS.   This  section  in  turn 
uses  either  LEVAL  or  VALUE  to  get  the  straight  address,  then  converts  it  to 
the  appropriate  byte  address.   The  result  is  returned  in  general  register  11. 

For  word  instruction  operands  the  bits  in  the  object  code  are  actually 
set  in  EVALUATE.   EQUs  and  STORAGE  instructions  simply  return  the  value  of  the 
operand (in general register 0). CONSTANT instructions have two possible
operand types, full words and bytes. If the operand occupies a full word the
result is returned in general register one and the return is to the address in
general register 14 plus four bytes. For operands which may end on a half word
(byte) the result is in location OPERAND and the length of the result (in bytes)
is in general register one. The return is to the address in general register 14
plus  eight  bytes.   This  means  that  the  call  to  CONEVL  is  followed  first  by  a 
branch  to  an  error  routine,  then  by  a  branch  to  a  full  word  section  and  then 
the  code  for  the  byte  operands. 

String instruction operands are either a register or an update
indicator. The arguments and the values returned are listed below.

Input            Value Returned

Register n       Core location of register n
^Register n      Complement of core location of register n
*0               -2
0                0
+1               1
1                1
-1               -1

The value is returned in general register 1.

The  register  usage  for  EVALUATE  varies  for  each  type  of  instruction 
being  worked  upon. 

SCAN 

The  SCAN  routine  removes  the  next  operand  from  the  current  card  image. 
Upon  calling  SCAN,  POINTER,  a  location  in  SCAN,  should  have  the  address  of  the 
first  character  of  the  next  operand,  or  any  blank  character  in  front  of  the 
operand. 

SCAN will stop at a comma, a semicolon preceded by a blank, or at the
first of a series of blanks which finish off a statement. For string
instructions only, SCAN will stop at a single blank character. The operand is
moved to location OPERAND in PASS2. Upon return, POINTER has the address of the
comma if the operand is followed by one, the address of the first following
blank if the operand is the last in a statement, or the address of the first
character in the next field if it is the last operand in a field in a string
instruction. The return is to the address passed in general register 14 plus
four bytes for the first two cases and to the address in general register 14
plus eight bytes for the last case.

The  instruction  after  the  call  to  SCAN  is  a  branch  to  an  error  handling 
section. The error return for SCAN is to the address in general register 14.

INOUT

INOUT generates the object code for input/output instructions. PASS2
sets the bits in the Device Block which specify whether it's a read or write
operation. It is INOUT's job to set the rest of the bits in the three words of
the instruction. For the bit patterns of the I/O instructions see Figures 5-7.

The  first  task  of  INOUT  is  to  determine  the  device  type.  There  is  a 
separate  section  for  DISK,  CARD,  LINE  and  TTY.  Each  section  uses  a  common 
routine  for  setting  the  bits  for  the  length  operand.   The  last  three  devices 
also  use  a  common  section  for  their  address  operand.   This  section  is  not  used 
for  the  disk  because  the  disk  uses  word  addressing  while  the  others  use  byte 
addressing. 

The normal return is to the address passed in general register 14
plus  four  bytes. 

STRINGS 

STRINGS is a subroutine to generate object code for the string
instructions. Each string instruction has a separate section of code, with
certain bit patterns being generated by common routines.

One such routine, RIPOFF, finds all the operands in a field, calls
the routine STRINGVL to evaluate them, stacks the results in STACKR and puts
the number of operands found in general register 6.

The routines A2TWO and A2ONE set the bits for word 1 of all string
instructions. This word controls the A2 register referred to in Yamada's
thesis [1].

SETA3 sets bits 7-10 of word 2. These bits indicate which PTR register is
loaded into A3. SETA1 sets bits 11-15 of word 2, which contain the A1 load
control.

Bits 0-10 of word 3, the jump control one bits, are set by SETJC1.
Bits 11-15 of word 3, controlling the counter, are set by SETCTR.

The bits of word 4, the second and third jump controls, are set by
SETJC2 and SETJC3, respectively.

For  a  description  of  the  bit  patterns  for  the  string  instructions, 
see pages 11-18 of Yamada's thesis [1].

STORE 

The STORE subroutine looks at general register nine, which contains
the card number, and checks to see if the intermediate file is full. If it is
full, it abends with a user 1 abend number. Otherwise it loads the 80
characters beginning at location CARD of PASS1 into the next spot in the
intermediate file.

LOAD 

LOAD takes the number in general register nine, assumes it is a valid
entry number in the intermediate file, and loads the 80 characters at that
position past the start of the intermediate file into location CARD of PASS1.
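
STORE and LOAD together amount to fixed-length record access in one large
buffer, which might be sketched in Python as below. The class name, the
numbering of cards from zero and the padding of short cards with blanks are
assumptions made for the example; the figures of 80 characters per card and
1500 cards are those given elsewhere in this document.

```python
CARD_LENGTH = 80
MAX_CARDS = 1500
INTSIZE = CARD_LENGTH * MAX_CARDS    # 120,000 characters


class IntermediateFile:
    def __init__(self):
        self.intfile = bytearray(INTSIZE)

    def store(self, card_number, card):
        """Copy one card image into slot card_number (numbered from zero)."""
        if (card_number + 1) * CARD_LENGTH > INTSIZE:
            raise RuntimeError("user abend 1: intermediate file is full")
        start = card_number * CARD_LENGTH
        image = card.ljust(CARD_LENGTH)[:CARD_LENGTH]   # pad or cut to 80
        self.intfile[start:start + CARD_LENGTH] = image

    def load(self, card_number):
        """Return the 80 characters of slot card_number; no validity check."""
        start = card_number * CARD_LENGTH
        return bytes(self.intfile[start:start + CARD_LENGTH])
```

As in the text, only the storing side checks for overflow; the loading side
trusts the entry number it is given.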

PUNCH 

The  PUNCH  routine  punches  the  80  columns  beginning  at  location  OUTBUF. 
If  entered  at  NOPUNCH,  it  punches  nothing.   The  latter  is  used  for  the  default 
option  if  no  OPTION  card  is  found  in  a  program. 

DPUNCH 

The DPUNCH routine controls the punching of object cards for the
D-Machine. It first subtracts the address of the beginning of the object code
buffer (in ABUF) from the current buffer pointer position (in ABUFPTR), giving
the number of bytes of object code to go on the current card. It converts this
to the number of 16-bit words. If this number is greater than 16, it abends
with a user 4 abend number. Otherwise it updates the starting location for
the next card, stored in location CSTART in PASS2. It then converts the binary
code to its hexadecimal equivalent and stores it in location OUTBUF of PUNCH,
in the column pattern described in section 2.3. Finally, it calls PUNCH to
output the card.
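
The length check and hexadecimal conversion can be sketched in Python as
follows. The function name and its arguments are hypothetical, the buffer is
assumed to hold a whole number of 16-bit words, and the card column layout and
the update of CSTART are left out of the example.

```python
MAX_WORDS_PER_CARD = 16


def dpunch_payload(buffer, abuf, abufptr):
    """Return (word count, hex text) for the object code now in the buffer.

    abuf is the address of the start of the buffer and abufptr the current
    buffer pointer, so their difference is the number of bytes in use.
    """
    n_bytes = abufptr - abuf
    n_words = n_bytes // 2                     # 16-bit words
    if n_words > MAX_WORDS_PER_CARD:
        raise RuntimeError("user abend 4: too much code for one card")
    code = bytes(buffer[:n_bytes])
    return n_words, code.hex().upper()         # binary to hexadecimal text
```

The limit of 16 words corresponds to the 32 bytes of object code that fit on
one D-Machine object card.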

To  be  compatible  with  the  simulator  the  first  word  of  the  routine 
contains  the  offset  to  a  section  called  FAKE.  When  a  STORAGE  instruction  is 
found  by  the  assembler,  it  must  punch  zeros  if  the  simulator  is  being  used. 
Rather than check for this, both punch routines have a section for this
occurrence. Here, though, the action taken is to return without punching
anything, since this is for the D-Machine.

The  normal  call  to  this  routine  is  to  branch  to  the  address  four 
bytes  past  DPUNCH.   The  zero  punch  call  gets  the  address  of  the  routine  (DPUNCH) 
and  adds  to  it  the  number  at  this  location.   This  gives  it  the  address  of  FAKE. 
Branching  to  FAKE  causes  a  return. 

SIMPUNCH 

SIMPUNCH works almost identically to DPUNCH. The object cards for the
simulator hold only eight D-Machine words, and the abend is user number 3 if
more than eight are called for. The column pattern for the object card is
given in section 2.3.

There  is  a  zero  punch  section  for  the  simulator.   It  is  called  STOR 
and  when  called  it  expects  general  register  0  to  contain  the  number  of  words 
of  zeros  to  be  generated.   This  section  also  updates  the  starting  location  for 
the  code  on  the  next  card  (in  CSTART)  and  calls  PUNCH  to  do  the  punching.   This 
action  may  be  taken  more  than  once,  if  more  than  eight  words  of  zeros  are  to  be 
punched. 
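
The repeated punching of zeros can be pictured with the Python sketch below.
It is an illustration under assumptions: the function name is hypothetical,
each card is represented simply as a (starting location, hex text) pair, and
CSTART is taken to count D-Machine words, which the text does not state.

```python
WORDS_PER_CARD = 8    # a simulator object card holds eight D-Machine words


def stor(zero_words, cstart):
    """Return (cards, new cstart) for a run of zero_words words of zeros."""
    cards = []
    while zero_words > 0:
        n = min(zero_words, WORDS_PER_CARD)
        cards.append((cstart, "0000" * n))   # n words of zeros, in hex
        cstart += n                          # start of code on the next card
        zero_words -= n
    return cards, cstart
```

A request for 20 words of zeros, for instance, yields two full cards and one
card of four words in this model.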

3.6  DUMP Instruction

If  you  look  carefully  at  the  mnemonics  accepted  in  PASS1  you  will 
find one called DUMP which has not been described yet. It is used for
debugging purposes only. When a DUMP instruction is found by PASS2 it causes
an abend, user number 100. This causes a dump which enables the programmer to
see what the state of PASS2 is at that point. This statement is usually used
just after a card which is not working properly. Only one DUMP card per
program will work, since the abend will terminate the run.

3.7  Making  Changes  in  the  Program 

This section will describe how to make three changes in the assembler:
increasing the acceptable operand length, lengthening the intermediate file and
adding a new instruction.

Throughout  the  user's  guide  (Chapter  2)  you  will  notice  that  there 
are  restrictions  on  how  long  literals  may  be.  For  instance,  character  literals 
may  only  be  133  characters  long.  It  is  anticipated  that  this  will  be  long 
enough  to  satisfy  most  users'  needs.   If,  however,  it  becomes  necessary  to 
change  this  sometime  in  the  future,  it  will  be  easy  to  do.   In  PASS2,  OPERAND 
is defined as 133 characters long (133C). Right after OPERAND is OL which is
equated  to  one  less  than  the  length  of  OPERAND,  and  OPLENGTH  which  is  an  address 
containing  the  length  of  OPERAND.   Both  of  these  variables  are  set  automatically 
when  the  length  of  OPERAND  is  specified.  All  references  in  the  program  to  the 
maximum  length  of  an  operand  use  OPLENGTH.  Therefore,  if  OPERAND  is  lengthened 
it  will  still  remain  compatible  with  the  rest  of  the  program,  so  it  could  be 
made  256,  or  anything  else  if  it  is  more  convenient. 

In the user's guide it was mentioned that the maximum length of a
program is 1500 cards. Anything longer causes a user abend 1. This is due to
the size of the intermediate file. The intermediate file is contained in the
STORE routine. At present it is 120,000 characters long. It may be increased
or decreased in length by changing both INTSIZE and INTFILE in the STORE
routine.

It  is  anticipated  that  several  new  instructions  will  be  introduced 
after  the  D-Machine  is  in  use  for  some  time.   To  see  how  to  add  an  instruction 
to  the  assembler,  assume  we  have  a  multiply  instruction  with  an  OP  code  of  50. 
This  will  naturally  be  a  word  instruction,  and  requires  the  user  to  specify 
two  or  three  operands.  As  a  convenience  we  will  allow  the  user  to  specify  the 
mnemonic  as  either  MULT  or  MULTIPLY. 

First,  we  must  add  the  mnemonics  to  PASS1.  We  do  this  by  putting 
MULT  and  MULTIPLY  into  the  STARTOPS  list,  between  MOVER  and  NAND.   They  are 
now  recognized  instructions.   Since  this  will  be  a  word  instruction  the  branch 
address  we  want  from  BTABLE  is  the  zero  entry  SAWB  (string  and  word  branch). 
The  length  of  our  instruction  will  be  four  words  so  the  entries  in  the  table 
OFFSETS  should  be: 

      DC    FL1'0',FL1'4'         MULT
      DC    FL1'0',FL1'4'         MULTIPLY

Again this should be between MOVER and NAND. This is all that is required for
PASS1.

In PASS2 we must first set the bit pattern in BITPAT. Since the only
bits always on in the OP field for our instruction will be the OP code (bits
5-11), the bit pattern in hex for the OP field will be 0640 (see Figures 2-4).
The first two operands must be present and the third is optional. These are
represented by hex 40 and 10, respectively. Therefore, the new entries in
BITPAT are:

      DC    X'06',X'40',X'40',X'40',X'10'    MULT
      DC    X'06',X'40',X'40',X'40',X'10'    MULTIPLY

Again, this goes between MOVER and NAND.
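
The arithmetic behind the OP field can be checked with a few lines of Python.
Two assumptions are made here: the OP code 50 is taken as a decimal number,
and bit 0 is counted as the least significant bit of the 16-bit word. They are
chosen because together they give hex 0640, that is, the two bytes X'06' and
X'40'. The function name is hypothetical.

```python
def op_field(op_code):
    """Place a 7-bit OP code in bits 5-11 and return the two bytes."""
    if not 0 <= op_code < 128:
        raise ValueError("OP code must fit in seven bits")
    word = op_code << 5              # move the code up to bits 5-11
    return word >> 8, word & 0xFF    # high byte, low byte


high, low = op_field(50)             # 50 * 32 = 1600 = hex 0640
```

The same calculation gives the first two BITPAT entries for any other word
instruction whose OP field has no bits on besides the OP code.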

The  branch  address  we  want  in  BADDS  is  WORDS,  which  is  at  an  offset 
of  zero.   The  entries  in  BTBL,  therefore,  should  be: 

      DC    X'00'    MULT
      DC    X'00'    MULTIPLY

This  is  all  that  is  required  to  completely  specify  an  instruction. 